Formalizing Causality as a Desideratum for Memory Models and Transformations of Parallel Programs

Authors

  • Chen Chen
  • Wenguang Chen
  • Vugranam Sreedhar
  • Rajkishore Barik
  • Vivek Sarkar
  • Guang Gao
Abstract

It has been observed in previous work that it is desirable to avoid causal violations in any execution or transformation of a parallel program. In this paper, we formalize the notion of causality in memory consistency models and code transformations. For memory models, we introduce a causality graph framework that can be used to analyze whether a particular memory model violates causality. We show that a popular memory model such as the Java Memory Model (JMM) [16] can lead to program executions that exhibit causality violations with respect to our definition of causality. The same analysis also appears to apply to a recent proposal for the C++ specification [7], where the underlying memory model may lead to similar problems. For code transformations, we identify transformations that are causality-preserving and those that are potentially causality-violating. We found that 10 of the 13 code transformation examples that were identified as causality-preserving with respect to the Java Memory Model fail our causality graph test and thus represent causality violations in our framework. Likewise, we present examples to illustrate how the recently proposed C++ Memory Model can lead to potential causality violations. Using our formalization, we establish causality as a desideratum for memory models and code transformations of parallel programs and define a Causal Memory Model (CMM) as the weakest memory model that preserves causality. We identify specific code transformations that are guaranteed to be causality-preserving. Finally, we present preliminary experimental results for a load elimination optimization to motivate the performance benefit of using the CMM model relative to the Sequential Consistency (SC) model. For the benchmark program studied, the number of getfield operations performed was reduced by 37.9% by using the CMM model instead of the SC model, and the execution time on a 16-core processor was reduced by 46.2%.
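As a rough illustration of what an execution "without a causal source" might look like, the sketch below enumerates every sequentially consistent interleaving of a small two-thread litmus test. The litmus test, the names `THREAD_1`, `THREAD_2`, `interleavings`, `run`, and `sc_outcomes`, and the tuple encoding of operations are all assumptions made for this example; they are not taken from the paper, and the paper's causality graph construction is not reproduced here because the abstract does not define it.

```python
from itertools import combinations

# Hypothetical litmus test (not from the paper); shared x = y = 0 initially.
#   Thread 1: r1 = x; y = r1
#   Thread 2: r2 = y; x = r2
THREAD_1 = [("load", "x", "r1"), ("store", "y", "r1")]
THREAD_2 = [("load", "y", "r2"), ("store", "x", "r2")]


def interleavings(a, b):
    """Yield every merge of a and b that preserves each thread's program order."""
    n = len(a) + len(b)
    for positions in combinations(range(n), len(a)):
        slots = set(positions)
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]


def run(schedule):
    """Execute one schedule against a single shared memory (SC semantics)."""
    memory = {"x": 0, "y": 0}
    registers = {}
    for op, var, reg in schedule:
        if op == "load":
            registers[reg] = memory[var]
        else:  # store the register's value into the shared variable
            memory[var] = registers[reg]
    return registers["r1"], registers["r2"]


def sc_outcomes():
    """Collect the (r1, r2) results reachable under sequential consistency."""
    return {run(s) for s in interleavings(THREAD_1, THREAD_2)}


if __name__ == "__main__":
    print(sorted(sc_outcomes()))
```

Under these assumptions every interleaving leaves both registers at 0, because no thread ever writes a value it did not first read. An outcome in which both registers held some other value would have to justify itself in a cycle, which is the informal intuition behind a causality violation; whether a given weaker memory model admits such an outcome is a separate question that this sketch does not answer.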

Similar resources

Establishing Causality as a Desideratum for Memory Models and Transformations of Parallel Programs

In this paper, we establish a notion of causality that should be used as a desideratum for memory models and code transformations of parallel programs. We introduce a Causal Acyclic Consistency (CAC) model which is weak enough to allow various useful code transformations, yet still strong enough to prevent any execution that exhibits “causal cycles” that may be caused by the Java Memory Model (...

An Approach to Parallelizing Fortran Programs using Rewriting Rules Technique

We present ongoing research in the area of transforming existing sequential Fortran programs into their parallel equivalents. Our approach is to use a rewriting-rules technique to automate the transformation process. Sequential source code is transformed into parallel code for shared-memory systems, such as multicore processors. Parallelizing and optimizing transformations are formall...

Formalizing an SSA-based Compiler for Verified Advanced Program Transformations

Formalizing an SSA-Based Compiler for Verified Advanced Program Transformations. Jianzhou Zhao. Supervisor: Steve Zdancewic. Compilers are not always correct due to the complexity of language semantics and transformation algorithms, the trade-offs between compilation speed and verifiability, etc. The bugs of compilers can undermine the source-level verification efforts (such a...

Transformations for the Optimistic Parallel Execution of Object-oriented Programs

This paper discusses the use of optimistic execution as a mechanism for parallelizing sequential object-oriented programs. Most parallelizing compilers to date have used compile-time data-dependency analysis to determine independent sections of code. This reliance on static information presents an overly restrictive view of dependencies in a program. In this paper, a set of transformations is p...

A Message-Passing Distributed Memory Parallel Algorithm for a Dual-Code Thin Layer, Parabolized Navier-Stokes Solver

In this study, the results of parallelizing a 3-D dual code (Thin Layer, Parabolized Navier-Stokes solver) for solving supersonic turbulent flow around body and wing-body combinations are presented. As a serial code, the TLNS solver is very time-consuming and uses a large amount of memory due to its iterative and lengthy computations. Also for complicated geometries, an exceeding number of grid...

Publication date: 2009